Information processing at the molecular scale is limited by thermal fluctuations. This can cause
undesired consequences in copying information since thermal noise can lead to errors that can 
compromise the functionality of the copy. For example, a high error rate during DNA duplication can 
lead to cell death. Given the importance of accurate copying at the molecular scale, it is fundamental 
to understand its thermodynamic features. In this paper, we derive a universal expression for the 
copy error as a function of entropy production and work dissipated by the system during wrong 
incorporations. Its derivation is based on the second law of thermodynamics, hence its validity is 
independent of the details of the molecular machinery, be it any polymerase or artificial copying 
device. Using this expression, we find that information can be copied in three different regimes. In 
two of them, work is dissipated to either increase or decrease the error. In the third regime, the 
protocol extracts work while correcting errors, reminiscent of a Maxwell demon. As a case study, 
we apply our framework to study a copy protocol assisted by kinetic proofreading, and show that 
it can operate in any of these three regimes. We finally show that, for any effective proofreading 
scheme, error reduction is limited by the chemical driving of the proofreading reaction.

INTRODUCTION

Copying information is a fundamental process in the natural world: all living systems, as well as
the vast majority of manmade digital devices, need to replicate information to function properly.
The quality of a copy relies on it being an accurate reproduction of the original and can be
quantified by the fraction $\eta$ of wrongly copied bits that it contains. Errors can be provoked by
several hardware-specific causes, such as imperfections in the copying machinery. At the molecular
scale, perfect copying does not exist, as thermal fluctuations constitute a fundamental source of
error regardless of the system. Since the reliability of the copying process is ultimately limited
by thermal noise, it must be understood in terms of thermodynamics, as recognized by Von Neumann [1].

Therefore, a critical question is whether one can invoke the second law of thermodynamics to
establish a universal connection between the error and physical quantities characterizing the copy
process. This issue should be addressed in a general framework, incorporating two basic features of
copying machineries. First, copying protocols often involve several intermediate discriminatory
steps used to regulate the accuracy and speed of the process. This is a characteristic property of
both natural and artificial error-correcting protocols. For example, accurate copying of DNA occurs
via multistep reactions [2]. Second, due to the statistical nature of the second law, one should
consider cyclically repeated copy operations rather than a single one [5]. This cyclical operation
is also consistent with the behavior of polymerases when duplicating long biopolymers.

To understand the thermodynamics of copying, we introduce a general framework in which the copying
protocol can be arbitrarily complex (as in models describing biochemical reactions) and copy
operations are cyclically repeated (as in models inspired by the physics of polymer growth). Our
framework describes template-assisted growth of a copy polymer (or "tape", see [16]) aided by a
molecular machine, see Fig. 1. Gray and white circles represent two different monomer types. The
molecular machine, represented as a red circle in the figure, is situated at the tip of the copy
strand and tries to match freely diffusing monomers with corresponding ones on the template. When a
free monomer arrives at the tip, the machine transitions through a network of intermediate states
to determine whether to incorporate or to reject it. Incorporation is more likely if the matching
is right, i.e. the color of the monomer matches that of the template, than if it is wrong. On
average, the copy strand elongates at a speed $v > 0$ and accumulates errors with probability $\eta$.

FIG. 1. Template-assisted polymerization. The template strand is a pre-existing polymer made up of
two different kinds of monomers (gray and white circles). A molecular copying machine (red circle)
assists the growth of a copy strand by incorporating freely diffusing monomers of two different
types, trying to match them with those of the template strand. Right and wrong matches are noted
$r$ and $w$.

FIG. 2. Transition network of template-assisted polymerization and examples. A. State space of the
template-assisted polymerization model. Monomer incorporation occurs via a network of intermediate
states represented inside the dashed circles. The two colors distinguish networks leading to
incorporation of right and wrong monomers. The structure is repeated in a tree-shaped structure as
the polymer grows by addition of more and more monomers. B. Examples of networks of intermediate
states. First example: template-assisted polymerization without intermediate states (see e.g.
[8, 15]). Second example: kinetic proofreading, where after an intermediate state a backwards driven
pathway removes errors to improve the overall accuracy of the copy. Third example: mRNA
translation, where the three copying steps represent initial binding, GTP hydrolysis and final
accommodation; a proofreading reaction is also present.

Close to thermodynamic equilibrium the process becomes very slow, $v \to 0$. The error is then
$\eta_{\rm eq} \approx \exp[-(\Delta E^w - \Delta E^r)/T]$, determined by the energy changes
$\Delta E^r$ and $\Delta E^w$ of right and wrong monomer incorporation and independent of the
copying protocol. In this case, the error can be reduced by increasing the gap
$(\Delta E^w - \Delta E^r)$, in agreement with Bennett's idea that cyclic copying can be performed
near equilibrium with arbitrary precision. This mechanism is however impractical, for example
because of the low speed it imposes. Instead, typical molecular machines spend chemical energy to
copy at a finite speed and out of thermodynamic equilibrium. Non-equilibrium copying protocols can
also reduce the error far below its equilibrium value. For example, the equilibrium estimate for
the error in DNA duplication is $\eta_{\rm eq} \approx 10^{-2}$, whereas the actual observed error
is $\eta \approx 10^{-9}$ [2]. An important non-equilibrium mechanism underlying error correction is
kinetic proofreading, which feeds on chemical energy to preferentially undo wrong copies. Other
non-equilibrium mechanisms such as induced fit and kinetic discrimination complement kinetic
proofreading to underpin the high accuracy of replication in biological systems.

In this work we demonstrate that, for the broad class of processes depicted in Fig. 2, a direct
relation links copy errors with non-equilibrium thermodynamic observables characterizing
incorporation of errors. In particular, at fixed work budget, the error decreases exponentially
with the total entropy produced per wrongly copied bit. This relation is completely general, in
contrast with conditions setting hardware-specific minimum errors $\eta_{\min}$ that characterize
each particular copying protocol. When studying wrong matches alone, three copying regimes can be
identified: error amplification, where energy is invested in increasing the error rate; error
correction, where energy is invested in decreasing the error rate; and Maxwell demon, where the
information contained in the errors is converted into work. We conclude by studying the specific
copying protocol of kinetic proofreading. We show that proofreading can operate in all three of
these regimes. Furthermore, for a broad class of proofreading protocols, we show that error
reduction is limited by the chemical energy spent in the proofreading reaction.

RESULTS 

Template-assisted polymerization 

We start our discussion by detailing the stochastic dynamics of the template-assisted
polymerization process sketched in Fig. 1. Its transition network is represented in Fig. 2A. The
rectangles correspond to the states of the system after the copying machine finalized incorporation
of a monomer. We denote them with a string such as $\ldots rrwr$, which refers to a particular
sequence of right and wrong matches (see also Fig. 1). Dashed circles enclose sub-networks of $n$
intermediate states, characteristic of the copying protocol. The intermediate states, represented
as blue/green circles for right/wrong matches in Fig. 2A, are used by the machine to process a
tentatively matched monomer and decide whether to incorporate it or not. We note intermediate
states as $\ldots rrwr\,r_i$, with $1 \le i \le n$, and analogously for wrong monomers. A copying
protocol is fully specified by the topology of the sub-networks, assumed to be the same for right
and wrong matches, and the kinetic rates $k^r_{ij}$ for right matches and $k^w_{ij}$ for wrong ones.
Differences in the rates are responsible for discrimination. Possible examples of sub-networks of
increasing complexity are represented in Fig. 2B.

Because of thermal fluctuations induced by the environment at temperature $T$, all kinetic
transitions are stochastic. The states are thus characterized by time-dependent probabilities
$P(\ldots r)$, $P(\ldots w)$, $P(\ldots r_i)$ and $P(\ldots w_i)$. Their evolution is governed by a
set of master equations which can be solved at steady state, see Methods. Key to the solution is to
postulate that errors are uncorrelated along the chain, so that
$P(\ldots) \propto \eta^{N^w}(1-\eta)^{N-N^w}$, where $N$ is the length of the chain and $N^w$ is
the total number of incorporated wrong matches. The error $\eta$ can then be determined via the
condition

$$\frac{\eta}{1-\eta} = \frac{v^w(\eta)}{v^r(\eta)} , \tag{1}$$

where $v^r$ and $v^w$ are the average incorporation speeds of right and wrong monomers,
respectively. They represent the average net rates at which right and wrong monomers are
incorporated in the copy. The net elongation speed $v$ is the sum of these two contributions,
$v = v^r + v^w$. Substituting the solution for $P(\ldots)$ into the master equations leads to
explicit expressions for $v^r$ and $v^w$ as a function of the error and all the kinetic rates. In
this way, Eq. (1) becomes a closed equation for the only unknown $\eta$. Note that Eq. (1) and the
definition of $v$ imply $v^r = (1-\eta)v$ and $v^w = \eta v$.


Thermodynamics of copying with errors 

The kinetic rates $k^r_{ij}$ and $k^w_{ij}$ are determined by the energy landscape of the system,
the chemical drivings $\mu_{ij}$ of the reactions, and the temperature $T$ of the thermal bath, as
represented in Fig. 3A. The chemical drivings represent the difference in chemical potential of
reactions, such as ATP hydrolysis, fueling the transitions $j \to i$. The energy differences of an
intermediate state with respect to the state before the candidate monomer incorporation are
$\Delta E^r_i = E(\ldots r_i) - E(\ldots)$ and similarly for wrong incorporation; the energy changes
after finalizing incorporation of a monomer are $\Delta E^r = E(\ldots r) - E(\ldots)$ and
analogously for wrong matches. Note that these energies are in a strict sense free energies as they
might depend, for example, on the monomer concentrations in the cell. Energetic discrimination can
be exploited when the wrong match is energetically more unstable than the right one,
$\Delta E^w > \Delta E^r$. In addition, wrong matches can also be discriminated kinetically, i.e. by
exploiting different activation barriers $\delta_{ij}$ in the transitions performed by the machine
when a right monomer is bound. In general, complex copying protocols can combine both these
mechanisms. Full expressions of the rates are summarized in Fig. 3B.

$$k^r_{ij} = \omega_{ij}\exp[(\Delta E^r_j + \mu_{ij} + \delta_{ij})/T]$$
$$k^w_{ij} = \omega_{ij}\exp[(\Delta E^w_j + \mu_{ij})/T]$$
$$k^r_{ji} = \omega_{ij}\exp[(\Delta E^r_i + \delta_{ij})/T]$$
$$k^w_{ji} = \omega_{ij}\exp[\Delta E^w_i/T]$$

FIG. 3. Energy landscape and kinetic rates. A Energetic diagram of a single transition in the
reaction network. B Corresponding kinetic rates. The transition $j \to i$ can be driven by energy
differences and the chemical driving $\mu_{ij}$. Transitions involving a right and a wrong monomer
can be characterized by different kinetic barriers $\delta_{ij}$, as well as different energetic
landscapes $\Delta E^r_i \neq \Delta E^w_i$. The bare rate $\omega_{ij}$ is the inverse
characteristic time scale of each reaction.
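
As a hedged illustration of the rate expressions of Fig. 3B, the short Python sketch below evaluates the four rates of a single transition and checks that the kinetic barrier cancels from the ratio of forward to backward rates, so that it changes the kinetics without changing the energetics. The function name `transition_rates` and all numerical values are illustrative assumptions, not taken from the paper; energies are expressed in units with $k_B = 1$.

```python
import math

def transition_rates(dE_r_j, dE_r_i, dE_w_j, dE_w_i, mu, delta, omega=1.0, T=1.0):
    """Rates of Fig. 3B for the transition j -> i (k_ij) and its reverse (k_ji).

    Returns (k_r_ij, k_w_ij, k_r_ji, k_w_ji). The barrier `delta` enters only the
    rates of the right monomer, in both directions.
    """
    k_r_ij = omega * math.exp((dE_r_j + mu + delta) / T)
    k_w_ij = omega * math.exp((dE_w_j + mu) / T)
    k_r_ji = omega * math.exp((dE_r_i + delta) / T)
    k_w_ji = omega * math.exp(dE_w_i / T)
    return k_r_ij, k_w_ij, k_r_ji, k_w_ji

# Hypothetical parameter values, chosen only for illustration.
k_r_ij, k_w_ij, k_r_ji, k_w_ji = transition_rates(
    dE_r_j=0.0, dE_r_i=0.0, dE_w_j=0.0, dE_w_i=3.0, mu=5.0, delta=10.0
)

# Forward/backward ratios depend on energies and driving only, not on the barrier.
ratio_r = k_r_ij / k_r_ji   # exp[(dE_r_j - dE_r_i + mu)/T]
ratio_w = k_w_ij / k_w_ji   # exp[(dE_w_j - dE_w_i + mu)/T]

# The barrier speeds up the forward step of the right monomer relative to the wrong one.
kinetic_bias = k_r_ij / k_w_ij  # exp[(dE_r_j - dE_w_j + delta)/T]
```

With these assumed values the right-monomer ratio is set by the driving alone, while the wrong-monomer ratio is lowered by its higher final energy.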

Given a steady-state elongation speed $v$, the chemical drivings perform an average work per added
monomer $\Delta W = v^{-1}\sum_{\langle ij\rangle}\mu_{ij}(\mathcal{J}^r_{ij} + \mathcal{J}^w_{ij})$,
where $\mathcal{J}^r_{ij}$ and $\mathcal{J}^w_{ij}$ are probability fluxes (see also Methods).
Further, the free-energy change per added monomer at equilibrium would be
$\Delta F_{\rm eq} = -T\log(e^{-\Delta E^r/T} + e^{-\Delta E^w/T})$. In the limit $v \to 0$, the
system approaches equilibrium and the population of all states is determined by detailed balance.
This implies that the equilibrium error is
$\eta_{\rm eq} = \exp[(-\Delta E^w + \Delta F_{\rm eq})/T]$. When driving the dynamics out of
equilibrium, the error will in general depart from its equilibrium value, leading to a positive
total entropy production. In Methods, we derive that the total entropy production per copied
monomer and the error are linked by the relation

$$T\Delta S_{\rm tot} = \Delta W - \Delta F_{\rm eq} - T\,D(\eta\|\eta_{\rm eq}) \ge 0 , \tag{2}$$

where $D(\eta\|\eta_{\rm eq}) = \eta\log(\eta/\eta_{\rm eq}) + (1-\eta)\log[(1-\eta)/(1-\eta_{\rm eq})]$
is the Kullback-Leibler distance between the equilibrium and non-equilibrium error distribution,
which is always non-negative and vanishes only for $\eta = \eta_{\rm eq}$. Eq. (2) states that the
average performed work is greater than the equilibrium free energy increase by a configurational
bound, $\Delta W - \Delta F_{\rm eq} \ge T\,D(\eta\|\eta_{\rm eq}) \ge 0$. In this view, the
Kullback-Leibler term in Eq. (2) can be interpreted as the additional free energy stored in a copy
characterized by an error different from its equilibrium value. This additional free energy can be
recovered by a spontaneous depolymerization process that will stop once the system reaches its
equilibrium error.

Eq. (2) relates the information content of the copy with thermodynamics. However, in many relevant
cases, the entropy production is dominated by the "excess work" $\Delta W - \Delta F_{\rm eq}$, so
that in practice Eq. (2) reduces to the traditional form of the second law. Consider for example a
case in which error correction is very effective, $\eta \ll \eta_{\rm eq}$. In this limit, the
Kullback-Leibler term tends to a constant, $D(\eta\|\eta_{\rm eq}) \to -\log(1-\eta_{\rm eq}) > 0$.
Since usually the equilibrium error is already small, this constant is also small,
$D(\eta\|\eta_{\rm eq}) \approx \eta_{\rm eq} \ll 1$. The reason is that, since errors are typically
rare, their overall contribution will be small.
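
The quantities entering Eq. (2) are simple to evaluate numerically. The following Python sketch, offered as a minimal illustration rather than as the authors' code, computes $\Delta F_{\rm eq}$, $\eta_{\rm eq}$ and the Kullback-Leibler term, and checks the small-error limit discussed above. The function names and the values $\Delta E^r = 0$, $\Delta E^w = 3T$ are assumptions made for the example; units are such that $k_B = 1$.

```python
import math

def delta_F_eq(dE_r, dE_w, T=1.0):
    """Equilibrium free-energy change per added monomer."""
    return -T * math.log(math.exp(-dE_r / T) + math.exp(-dE_w / T))

def eta_eq(dE_r, dE_w, T=1.0):
    """Equilibrium error, exp[(-dE_w + dF_eq)/T]."""
    return math.exp((-dE_w + delta_F_eq(dE_r, dE_w, T)) / T)

def kl_error(eta, eta_e):
    """Kullback-Leibler term D(eta || eta_eq) of Eq. (2), valid for 0 < eta < 1."""
    return (eta * math.log(eta / eta_e)
            + (1.0 - eta) * math.log((1.0 - eta) / (1.0 - eta_e)))

# Assumed example energies: right match at 0, wrong match 3T higher.
e_eq = eta_eq(dE_r=0.0, dE_w=3.0)

d_at_equilibrium = kl_error(e_eq, e_eq)       # vanishes at eta = eta_eq
d_small_error = kl_error(1e-9, e_eq)          # effective error correction
d_limit = -math.log(1.0 - e_eq)               # limiting constant for eta -> 0
```

For these assumed energies the limiting constant is close to $\eta_{\rm eq}$ itself, in line with the estimate $D(\eta\|\eta_{\rm eq}) \approx \eta_{\rm eq}$.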

To better understand the link between errors and thermodynamics, we consider the average entropy
production associated with an error incorporation,
$\Delta S^w_{\rm tot} = \dot S^w_{\rm tot}/v^w$, where $\dot S^w_{\rm tot}$ is the entropy
production rate coming from incorporation of wrong monomers only. The quantity
$\Delta S^w_{\rm tot}$ also obeys a second-law-like inequality

$$T\Delta S^w_{\rm tot} = \Delta W^w - \Delta F_{\rm eq} - T\log(\eta/\eta_{\rm eq}) \ge 0 , \tag{3}$$

where $\Delta W^w = (v^w)^{-1}\sum_{\langle ij\rangle}\mu_{ij}\mathcal{J}^w_{ij}$ is the average work
performed per wrong match (see Methods). Rearranging terms in Eq. (3) yields a general expression
for the error in terms of thermodynamic observables

$$\eta = \eta_{\rm eq}\exp\left[-\Delta S^w_{\rm tot} + (\Delta W^w - \Delta F_{\rm eq})/T\right] . \tag{4}$$

This result does not depend on microscopic details of the copying protocol, such as the
discrimination barriers $\delta_{ij}$. Eq. (4) provides a direct link between thermodynamic
irreversibility and accuracy of copying. It states that, given a fixed work budget, reduction of
the error beyond its equilibrium value always comes at a cost in terms of entropy production.
However, the dependence of the error on the thermodynamic quantities is non-trivial to derive from
Eq. (4), as varying the work also affects the entropy production.

FIG. 4. Template-assisted polymerization without intermediate states. A Excess work
$\Delta W - \Delta F_{\rm eq}$, entropy production and Kullback-Leibler term of Eq. (2) as a
function of the error. Notice that the excess work dominates over the information term. B Same
terms as in A, but for wrong monomers only. In this case, the information term dominates the
entropy production. C Relation between error and entropy production of wrong monomers, together
with thermodynamic (red, dashed) and hardware-specific (black, dashed) bounds. In all panels, the
driving $\mu_{10}$ is varied to vary the error. Parameters are $\delta_{10} = 10T$,
$\Delta E^r = 0$, $\Delta E^w = 3T$.

The inequality of Eq. (3) reveals the existence of three possible copying regimes:

1. Error amplification, $\Delta W^w - \Delta F_{\rm eq} > 0$ and $\eta > \eta_{\rm eq}$. In this
regime, a positive excess work for wrong matches leads to an error higher than its equilibrium
value. While, in this case, dissipating energy is counterproductive in terms of the achieved
error, it can be justified by the need of achieving a high copying speed.

2. Maxwell demon, $\Delta W^w - \Delta F_{\rm eq} < 0$ and $\eta < \eta_{\rm eq}$. In this regime,
the machine extracts work while lowering the information entropy of the chain with respect to its
equilibrium value, $-\eta\log(\eta) < -\eta_{\rm eq}\log(\eta_{\rm eq})$. This regime is
reminiscent of a Maxwell demon, since an apparent violation of the second-law-like inequality,
Eq. (3), occurs from neglecting entropy production associated with information manipulation (see
e.g. [20]). Note, however, that the excess work associated to right matches compensates this term,
so that growth of a copolymer cannot result in $\Delta W - \Delta F_{\rm eq} < 0$, see Eq. (2).

3. Error correction, $\Delta W^w - \Delta F_{\rm eq} > 0$ and $\eta < \eta_{\rm eq}$. This is an
error-correction scenario in which work is dissipated to achieve an error lower than the
equilibrium error. In this case, which is the most common for biological machines, Eq. (4) implies
a simple bound on the error, $\eta \ge \eta_{\rm eq}\exp(-\Delta S^w_{\rm tot})$.
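
The three regimes listed above follow from the signs of the excess work of wrong matches and of $\log(\eta/\eta_{\rm eq})$, constrained by Eq. (3). The Python sketch below is one possible way to encode this classification; it is an illustration under stated assumptions, not part of the original analysis. The function names are hypothetical, entropy is measured in units of $k_B$, and the handling of boundary cases (exact equalities) is an arbitrary choice of this sketch.

```python
import math

def wrong_match_entropy(excess_work_w, eta, eta_eq, T=1.0):
    """Entropy production per wrong match from Eq. (3), in units of k_B.

    excess_work_w is dW^w - dF_eq.
    """
    return excess_work_w / T - math.log(eta / eta_eq)

def error_from_thermo(excess_work_w, dS_w, eta_eq, T=1.0):
    """Error reconstructed from thermodynamic observables, Eq. (4)."""
    return eta_eq * math.exp(-dS_w + excess_work_w / T)

def copying_regime(excess_work_w, eta, eta_eq, T=1.0):
    """Classify an operating point into one of the three regimes.

    Raises ValueError if the point would violate Eq. (3). Points lying exactly
    on a boundary are assigned to a neighbouring regime (arbitrary choice).
    """
    if wrong_match_entropy(excess_work_w, eta, eta_eq, T) < 0.0:
        raise ValueError("operating point violates the second-law-like inequality")
    if excess_work_w > 0.0:
        return "error amplification" if eta > eta_eq else "error correction"
    return "Maxwell demon"

# Hypothetical operating points with eta_eq = 0.05 (excess work in units of T).
regime_a = copying_regime(2.0, 0.10, 0.05)
regime_b = copying_regime(2.0, 0.01, 0.05)
regime_c = copying_regime(-0.5, 0.01, 0.05)
```

Negative excess work combined with $\eta > \eta_{\rm eq}$ is rejected, because Eq. (3) would then give a negative entropy production.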

Given the copying protocol and the kinetic rates, the copying machinery will achieve a certain
error $\eta$ and operate in one of these three regimes. Varying the kinetic rates affects both the
error and the thermodynamic observables, possibly shifting the operating regime of the machine. To
better scrutinize these aspects, we now move to considering specific protocols.

In the simplest possible example, incorporation occurs in a single step, as sketched on the top
panel of Fig. 2B (see also [15]). It can be shown that this protocol is always dissipative,
$\Delta W^w - \Delta F_{\rm eq} > 0$. In general, wrong monomers can be discriminated by a kinetic
barrier $\delta_{10}$ and an energy difference $\Delta E^w - \Delta E^r$ [13]. If the kinetic
barrier is larger than the energy difference, $\delta_{10} > \Delta E^w - \Delta E^r$, it can be
shown that $\eta < \eta_{\rm eq}$, corresponding to error correction. If it is lower, then
$\eta > \eta_{\rm eq}$, which corresponds to error amplification [13]. In Fig. 4A we plot the
different terms of the total entropy production, Eq. (2), for the error correction case. As
discussed before, the information contribution to the total entropy production is negligible for
small errors. Instead, note in Fig. 4B that the information term of Eq. (3) dominates over the work
performed per wrong match. This implies that the universal expression for the error, Eq. (4), is
very well approximated by the lower bound of error correction, as shown in Fig. 4C. The error
departs from this bound only when it approaches its hardware-specific minimum
$\eta_{\min} \approx e^{-\delta_{10}/T}$. Note that increasing $\delta_{10}$ decreases both
$\eta_{\min}$ and the dissipative cost $\Delta S^w_{\rm tot}$ of copying at an error rate
$\eta > \eta_{\min}$.
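
To make the single-step example concrete, the following Python sketch solves the steady-state condition of Eq. (1) numerically for a model without intermediate states. It is a minimal sketch under stated assumptions: the speeds are taken as $v^r = k^r_{10} - k^r_{01}(1-\eta)$ and $v^w = k^w_{10} - k^w_{01}\eta$, which is what the uncorrelated-error ansatz of Methods gives when there are no intermediate states; the rates follow the form of Fig. 3B with the state before incorporation as energy reference; the function name `single_step_error`, the use of bisection, and the bare rate $\omega = 1$ are choices made here for illustration.

```python
import math

def single_step_error(mu, delta, dE_r, dE_w, omega=1.0, T=1.0, iters=200):
    """Steady-state error of single-step copying, from eta/(1-eta) = v_w/v_r."""
    # Rates of the form of Fig. 3B for the transition 0 -> 1 and its reverse.
    k_r_10 = omega * math.exp((mu + delta) / T)
    k_w_10 = omega * math.exp(mu / T)
    k_r_01 = omega * math.exp((dE_r + delta) / T)
    k_w_01 = omega * math.exp(dE_w / T)

    def residual(eta):
        v_r = k_r_10 - k_r_01 * (1.0 - eta)   # net speed of right incorporations
        v_w = k_w_10 - k_w_01 * eta           # net speed of wrong incorporations
        return eta * v_r - (1.0 - eta) * v_w  # zero when Eq. (1) holds

    # residual(0) = -k_w_10 < 0 and residual(1) = k_r_10 > 0: bisect on [0, 1].
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Parameters of the kind quoted for Fig. 4: delta_10 = 10 T, dE_r = 0, dE_w = 3 T.
delta_10, dE_r, dE_w = 10.0, 0.0, 3.0
dF_eq = -math.log(math.exp(-dE_r) + math.exp(-dE_w))
eta_equilibrium = math.exp(-dE_w + dF_eq)

eta_at_stall = single_step_error(dF_eq, delta_10, dE_r, dE_w)   # v -> 0
eta_driven = single_step_error(5.0, delta_10, dE_r, dE_w)       # moderate driving
eta_strong = single_step_error(30.0, delta_10, dE_r, dE_w)      # strong driving
```

In this sketch the error equals $\eta_{\rm eq}$ when the driving equals $\Delta F_{\rm eq}$ (vanishing speed), and approaches a value close to $e^{-\delta_{10}/T}$ at strong driving, consistent with the error-correction case $\delta_{10} > \Delta E^w - \Delta E^r$.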


Energetic bound to proofreading accuracy 

In kinetic proofreading, a copying pathway that incorporates monomers at a speed $v_c > 0$ is
assisted by a parallel pathway which preferentially removes wrong matches at a speed $v_p < 0$, see
Fig. 5A. Hereafter the sub-index "p" indicates that quantities are computed only for the
proofreading reaction. To maintain a negative speed, the proofreading reaction must be driven
backward either by performing a work per added monomer $\Delta W_p$, or by exploiting a high free
energy difference $\Delta F_{\rm eq}$ between the final and the initial state. By means of
proofreading, one can achieve lower errors than those of the copying pathway alone, at the cost of
spending additional chemical driving and reducing the net copying speed $v = v_c + v_p$.

We consider a proofreading protocol consisting of a copying pathway with one intermediate step in
addition to the proofreading reaction, see middle panel in Fig. 2B. By tuning the rates, this model
can operate in all three regimes described in the previous section, as shown in Fig. 5B. In
particular, in the Maxwell demon regime, the error can be reduced up to one order of magnitude
below its equilibrium value while at the same time extracting work from the wrong copying
reaction. Very small errors are achieved in a strongly driven error correction regime, where the
error rate satisfies $\eta \ge \eta_{\rm eq}\exp(-\Delta S^w_{\rm tot})$. However, at variance with
the example of the previous section, here the entropy production quickly becomes much larger than
this bound. The reason is that effective proofreading requires a cycle in the reaction pathway
which fundamentally involves dissipation of work. This dissipation, rather than the information
term, dominates the entropy production of wrong matches at low errors. This is at variance with
the single-step model of the previous section, where no cycles are present and the configurational
entropy dominates over dissipation.

[Figure 5. Panel A labels: elongation; proofreading, $v_p < 0$. Panel C horizontal axis:
proofreading work, $\Delta W_p$ ($k_BT$).]

FIG. 5. Regimes and bounds of kinetic proofreading. A Scheme of a generic proofreading scheme.
Copying occurs at a net speed $v_c > 0$ through an arbitrary reaction network of intermediate
states. After the copy is finalized, a proofreading reaction removes errors at a speed $v_p < 0$.
The net average speed is $v = v_c + v_p > 0$. B Thermodynamic regimes of kinetic proofreading. The
model combines a copying scheme with one intermediate state with kinetic proofreading, as
represented in Fig. 2B. The shaded regions denote the three thermodynamic regimes discussed in the
previous section. Parameters are $\delta_{10} = 5T$, $\delta_{21} = 0$, $\delta_{02} = 5T$,
$\Delta E^w_2 = \Delta E^w_1 = 2T$, $\Delta E^r_2 = \Delta E^r_1 = 0$. We remind that states 0, 1,
and 2 represent the state before monomer incorporation, the intermediate state, and the final
state where the monomer has been incorporated, respectively (see also Fig. 2B). For each value of
the error $\eta$, the other free parameters ($\mu_{10}$, $\mu_{21}$, $\omega_{21}$, $\omega_{02}$)
are determined by minimizing the entropy production per copied wrong monomer
$\Delta S^w_{\rm tot}$. C Minimum error as a function of the proofreading work
$\Delta W_p = \mu_{02}$. For each curve, energies and activation barriers are fixed parameters as
in the previous panel (except for $\delta_{02}$ which varies, as in the legend). For each value of
$\mu_{02}$, the other free parameters ($\mu_{10}$, $\mu_{21}$, $\omega_{21}$, $\omega_{02}$) are
determined by numerically minimizing the error $\eta$. Red-dashed and black-dashed lines represent
thermodynamic and hardware-specific bounds, respectively.

To derive a better estimate of the error in proofreading, we now focus on the rate of entropy
production during the proofreading of wrong matches,
$T\dot S^w_p = -v^w_p[\Delta W^w_p + \Delta F_{\rm eq} + T\log(\eta/\eta_{\rm eq})]$. Using that in
proofreading $v_p < 0$ while $\dot S^w_p \ge 0$, we can derive the following bound for the error

$$\eta \ge \eta_{\rm eq}\exp\left[-\frac{\Delta W_p + \Delta F_{\rm eq}}{T}\right] , \tag{5}$$

where we have further used that $\Delta W^w_p = \Delta W_p$ (see Methods for details). This
equation is one of the main results of this paper. It shows that error reduction in proofreading
is limited by its energetic cost, either in the form of chemical work in the proofreading pathway
[19] or free energy of the final state, which involves performing work in the copying pathway [4].
Similarly to Eq. (4), this bound does not depend on details of the copying protocol. In Fig. 5C, we
show the error of the specific proofreading model of Fig. 2B as a function of the proofreading
work. One can appreciate that the bound from Eq. (5) is tightly met for a wide range of errors. For
very small values of $\Delta W_p$, when $v_p > 0$ and no proofreading occurs, the bound is not
satisfied. Finally, for very large work values, the error approaches the hardware-specific minimum
$\eta_{\min}$.

In this case, the value of $\eta_{\min}$ can be obtained from the explicit solution of the model
(see derivation in Methods). In the strongly driven regime, the error $\eta$ decreases at
increasing proofreading work $\Delta W_p$. At the same time, $v_p$ becomes more negative as more
copies are proofread. The minimum error is thus obtained in the limit of vanishing elongation
speed, when the proofreading speed is negative enough to arrest copying, $v_p = -v_c$. Imposing
this condition gives the hardware-specific minimum error

$$\eta_{\min} \approx e^{(-\delta_{10} + \delta_{02} - \Delta E^w + \Delta E^r)/T} . \tag{6}$$

This expression shows that the error of the first copying step, approximately equal to
$e^{-\delta_{10}/T}$ because of the large kinetic barrier, is reduced by a factor
$e^{(\delta_{02} - \Delta E^w + \Delta E^r)/T}$ by the additional discrimination of the
proofreading reaction.


DISCUSSION 

In this paper, we analyzed template-assisted polymerization, where copies are cyclically produced
by an arbitrarily complex reaction network. This broadly extends Bennett's original
copolymerization model [8] and further studies where monomer incorporation occurs in a single
step. In particular, the results presented here allow for analyzing the thermodynamics of
realistic biological copying protocols, where a complex reaction network is responsible for error
correction.

At variance with models for the copy of a single monomer, in template-assisted polymerization the
number of possible states of the chain grows exponentially at steady-state. This exponential
increase causes the appearance of an information term in the formula for the total entropy
production, Eq. (2). A similar term appears in the context of Landauer principle out of
equilibrium [21], and was interpreted as the amount of information necessary to shift from the
equilibrium distribution to the non-equilibrium one. Eq. (2) should not be confused with a formally
similar one derived by Gaspard and Andrieux, which represents a physically different quantity,
i.e. the entropy of the copy given the template. This difference is physically important: the
information term in Eq. (2) can be thought of as a measure of distance from equilibrium, as it is
equal to zero at equilibrium and positive otherwise. In contrast, the information term in Gaspard
and Andrieux's formula goes to zero only in the limit of vanishing error rate.

The main result of this paper is that, thanks to the explicit dependence on the error, the second
law of thermodynamics can be used to obtain general expressions and bounds on the copy error. This
allows us to identify three different copying regimes: error amplification, error correction, and
Maxwell demon, all of which can be achieved by kinetic proofreading.

Considering cyclic copying is analogous to considering cyclic transformations when studying the
efficiency of thermodynamic engines. Besides being the natural choice to properly describe the
thermodynamics of the process, template-assisted polymerization allows for out-of-equilibrium
copying regimes which are absent in single-monomer models. For example, a lower bound to the error
analogous to Eq. (5) is generally valid in closed networks [21]. In template-assisted
polymerization, this limit can be broken when the proofreading reaction reverts its flux, as seen
in Fig. 5C for small values of the work.

We briefly discuss the relevance of our results for interpreting experimental data. Many
biological copying pathways are driven by the hydrolysis of one single GTP molecule. The chemical
work spent in this process is
$\Delta\mu = \Delta\mu^0 + k_BT\log\left(\frac{[\mathrm{GTP}]}{[\mathrm{GDP}][\mathrm{P_i}]}\right)$.
Taking as reference the bare potential of ATP, $\Delta\mu^0 = 14.5\,k_BT$, and typical
concentrations [GTP] = 1 mM, [GDP] = 0.01 mM and [P$_i$] = 1 mM, we obtain
$\Delta\mu_{\rm GTP} \approx 20\,k_BT$. In a protocol involving proofreading, this information and
Eq. (5) can be used to set a lower bound for the error. Assuming that the energy of GTP is all
spent to increase the free energy of the chain, $\Delta F_{\rm eq} \approx \Delta\mu_{\rm GTP}$, we
obtain that the total error reduction is $\eta/\eta_{\rm eq} \ge 10^{-9}$. The value of this bound
is smaller than typically observed errors, which suggests that less than 100% of the energy of
hydrolysis is utilized to increase the free energy of the system.
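
The order of magnitude quoted above can be checked with a few lines of Python. The sketch below evaluates the right-hand side of Eq. (5); the function name `proofreading_error_bound` is hypothetical, and setting the proofreading work to zero while assigning the full $20\,k_BT$ to $\Delta F_{\rm eq}$ is an assumption made here to mirror the scenario described in the text. Energies are in units of $k_BT$.

```python
import math

def proofreading_error_bound(dW_p, dF_eq, eta_eq=1.0, T=1.0):
    """Lower bound of Eq. (5): eta >= eta_eq * exp[-(dW_p + dF_eq)/T].

    With eta_eq = 1 the return value is the bound on the ratio eta/eta_eq.
    """
    return eta_eq * math.exp(-(dW_p + dF_eq) / T)

# Assumed scenario: about 20 k_B T from GTP hydrolysis, all stored as free
# energy of the chain, with no additional proofreading work.
ratio_bound = proofreading_error_bound(dW_p=0.0, dF_eq=20.0)
```

The resulting bound on $\eta/\eta_{\rm eq}$ is of order $10^{-9}$, the same order of magnitude as the estimate given in the text.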

Given the flexibility of our framework, many complex copying mechanisms studied in the literature
as non-cyclic processes can be directly considered as template-assisted polymerization problems
and studied from the point of view of thermodynamic efficiency. One limitation of our treatment is
the lack of long-term memory: while processing a monomer, the machine does not keep track of the
past errors encountered along the chain. A more general scheme could exploit correlations in the
template sequence to reduce the error. An example of this is backtracking [26], where regions of
the template containing many errors are entirely reprocessed. Generalization of template-assisted
polymerization to these cases will be the subject of a future study.

The thermodynamic relations derived in this paper fundamentally limit the capabilities of
stochastic machines to reduce and proofread errors, and are reminiscent of similar bounds derived
for adaptation error in sensory systems [27]. It will be of interest to understand whether our
results can be applied to error correction in sensing. For example, it is known that sensory
pathways exploit proofreading in chemosensing both by isolated receptors [28] and by cooperative
ones [29]. Clarifying the links between these problems will constitute an important step towards
formulating general thermodynamic principles [30] limiting the accuracy of non-equilibrium
information-processing.


METHODS 

Steady-state solution of template-assisted polymerization

In this section, we briefly outline how to solve the template-assisted polymerization model. We
start by writing the master equations governing the evolution of probabilities of all main states
$P(\ldots)$ and those of the intermediate states $P(\ldots r_i)$ and $P(\ldots w_i)$. The
probability flux between two arbitrary intermediate states $\ldots r_j$ and $\ldots r_i$ is
$J^r_{ij}(\ldots) = k^r_{ij}P(\ldots r_j) - k^r_{ji}P(\ldots r_i)$, and analogous for wrong matches
(see Fig. 2A). The master equations for the intermediate states can be expressed in a compact form
in terms of these fluxes

$$\dot P(\ldots r_i) = \sum_{j=0}^{n+1} J^r_{ij}(\ldots) , \qquad
\dot P(\ldots w_i) = \sum_{j=0}^{n+1} J^w_{ij}(\ldots) , \tag{7}$$

where the upper dot denotes time derivative. Note that the sum extends to $j = 0$ and $j = n+1$,
which with an abuse of notation correspond to the main states neighboring the network of
intermediate states, $\ldots r_0 = \ldots w_0 = \ldots$, $\ldots r_{n+1} = \ldots r$ and
$\ldots w_{n+1} = \ldots w$. Master equations for main states are easily written by distinguishing
states ending with a wrong match from those ending with a right match

$$\dot P(\ldots r) = \sum_{j=0}^{n+1}\left[J^r_{n+1,j}(\ldots) - J^r_{j0}(\ldots r) - J^w_{j0}(\ldots r)\right] , \qquad
\dot P(\ldots w) = \sum_{j=0}^{n+1}\left[J^w_{n+1,j}(\ldots) - J^r_{j0}(\ldots w) - J^w_{j0}(\ldots w)\right] , \tag{8}$$

where the three sets of fluxes in each equation correspond to finalized incorporation of the last
monomer in the main state, and attempted incorporation of a right and wrong monomer. Eqs. (7) are
similar to those written for biochemical models, while Eqs. (8) are similar to those used for
polymer growth.

The system of equations (7) and (8) can be solved at
steady state, $\dot P = 0$, by means of the ansatz that errors
are uncorrelated. Given an error $\eta$, to be determined a
posteriori, we impose that the steady-state probability of
a string of length $N$ with $N^w$ errors is
$P(\ldots) \propto (1-\eta)^{N-N^w}\eta^{N^w}$. This implies

$$P(\ldots r) = P(\ldots)(1-\eta) \quad\text{and}\quad P(\ldots w) = P(\ldots)\,\eta\,. \tag{9}$$
For the intermediate states we make the additional ansatz

$$P(\ldots r_i) = P(\ldots)\,p^r_i \quad\text{and}\quad P(\ldots w_i) = P(\ldots)\,p^w_i\,, \tag{10}$$

where $p^r_i$ and $p^w_i$ are the occupancies of the intermediate
states $1 \le i \le n$, assumed to be independent of $P(\ldots)$.

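Substituting Eqs. (9) and (10) in Eqs. (7) yields a system of
$2n$ linear equations, from which the occupancies can be
expressed as functions of the kinetic rates and the error
$\eta$, still to be determined. It is now convenient to define
the occupation fluxes $J^r_{ij}$ as

$$J^r_{ij} = \mathcal{N}\left(k^r_{ij}\,p^r_j - k^r_{ji}\,p^r_i\right), \tag{11}$$

where $\mathcal{N} = [1 + \sum_{i=1}^{n}(p^r_i + p^w_i)]^{-1}$ is a
normalization constant such that
$\sum_{\ldots}[P(\ldots) + \sum_{i=1}^{n}(P(\ldots r_i) + P(\ldots w_i))] = 1$
and thus $\sum_{\ldots}P(\ldots) = \mathcal{N}$. In Eq. (11), the
boundary occupancies are $p_0 = 1$, $p^r_{n+1} = 1-\eta$ and
$p^w_{n+1} = \eta$, as follows from Eq. (9). Occupation fluxes are
related to the probability fluxes via
$\mathcal{J}^r_{ij}(\ldots) = P(\ldots)\,J^r_{ij}/\mathcal{N}$, and
analogously for wrong matches. The speeds of right and
wrong monomer incorporation can now be expressed as
$v^r = \sum_i J^r_{n+1,i} = \sum_i J^r_{i0}$ and
$v^w = \sum_i J^w_{n+1,i} = \sum_i J^w_{i0}$.
Replacing the ansatz in Eqs. (8) and using these definitions
results in the main-text relation between the error and the
incorporation speeds, which can finally be used to determine
the error.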
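The last substitution can be written out explicitly. The following is a sketch rather than the authors' own derivation: it uses only Eqs. (8)-(11) and the definitions of $v^r$ and $v^w$ above, and it assumes, as the ansatz implies, that the occupation fluxes do not depend on the string.

```latex
\begin{align}
0 = \dot P(\ldots r)
  &= \sum_{j=0}^{n+1}\left[\mathcal{J}^r_{n+1,j}(\ldots)
     - \mathcal{J}^r_{j0}(\ldots r) - \mathcal{J}^w_{j0}(\ldots r)\right] \\
  &= \frac{P(\ldots)}{\mathcal{N}}\,v^r
     - \frac{P(\ldots r)}{\mathcal{N}}\left(v^r + v^w\right) \\
  &= \frac{P(\ldots)}{\mathcal{N}}\left[v^r - (1-\eta)\left(v^r + v^w\right)\right]
  \quad\Longrightarrow\quad
  \frac{\eta}{1-\eta} = \frac{v^w}{v^r}\,.
\end{align}
```

The equation for $\dot P(\ldots w)$ gives the same condition, so the uncorrelated-error ansatz is self-consistent.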


Entropy production rate 


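To calculate the steady-state entropy production rate,
we start with the general expression

$$\dot S_{\rm tot} = \sum_{\ldots,\langle ij\rangle}\left[\mathcal{J}^r_{ij}(\ldots)\log\left(\frac{k^r_{ij}P(\ldots r_j)}{k^r_{ji}P(\ldots r_i)}\right) + \mathcal{J}^w_{ij}(\ldots)\log\left(\frac{k^w_{ij}P(\ldots w_j)}{k^w_{ji}P(\ldots w_i)}\right)\right]. \tag{12}$$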

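We now factorize the sum into one over strings (noted
$\sum_{\ldots}$) and one over intermediate states (noted
$\sum_{\langle ij\rangle}$, where $\langle ij\rangle$ denotes links).
Using the definition of the occupation fluxes, Eq. (11), we obtain:

$$\dot S_{\rm tot} = \left(\sum_{\ldots}\frac{P(\ldots)}{\mathcal{N}}\right)\sum_{\langle ij\rangle}\left[J^r_{ij}\log\left(\frac{k^r_{ij}\,p^r_j}{k^r_{ji}\,p^r_i}\right) + J^w_{ij}\log\left(\frac{k^w_{ij}\,p^w_j}{k^w_{ji}\,p^w_i}\right)\right]. \tag{13}$$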

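Since the sum over all states is normalized to one, we
have that $\sum_{\ldots}P(\ldots) = [1 + \sum_{i=1}^{n}(p^r_i + p^w_i)]^{-1}$.
Using the definition of $\mathcal{N}$ in the previous section, the
term outside the brackets is equal to 1. Substituting the
definition of the rates given in the figure of the model into
Eq. (13) yields

$$\begin{aligned}
\dot S_{\rm tot} ={}& \sum_{\langle ij\rangle}\left(J^r_{ij} + J^w_{ij}\right)\mu_{ij}/T
 + \sum_{\langle ij\rangle}J^r_{ij}\log\left(\frac{p^r_j}{p^r_i}\right)
 + \sum_{\langle ij\rangle}J^w_{ij}\log\left(\frac{p^w_j}{p^w_i}\right)\\
&+ \sum_{\langle ij\rangle}J^r_{ij}\left(\Delta E^r_j - \Delta E^r_i\right)/T
 + \sum_{\langle ij\rangle}J^w_{ij}\left(\Delta E^w_j - \Delta E^w_i\right)/T\,.
\end{aligned} \tag{14}$$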

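For an isolated network at steady state, all terms but
the first one vanish by flux conservation. However,
in cyclic copying the states $i = 0$ and $i = n+1$ receive
a finite flux from the rest of the transition network (see
the figure of the model). Using $\sum_j J^r_{ij} = 0$ for
$1 \le i \le n$ (and analogously for wrong matches), the
definitions of $v^r$ and $v^w$, and the relation between the
error and the incorporation speeds, we obtain

$$\dot S_{\rm tot} = \sum_{\langle ij\rangle}\left(J^r_{ij} + J^w_{ij}\right)\mu_{ij}/T - \eta\,v\left[\log(\eta) + \Delta E^w/T\right] - (1-\eta)\,v\left[\log(1-\eta) + \Delta E^r/T\right]. \tag{15}$$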


Using the definition of equilibrium error and free energy
difference per step given in Results, we arrive at

$$T\dot S_{\rm tot} = v\left[\Delta W - \Delta F_{\rm eq} - T\,D(\eta\|\eta_{\rm eq})\right]. \tag{16}$$


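Defining the entropy production per step as
$\Delta S_{\rm tot} = \dot S_{\rm tot}/v$ leads to the
corresponding main-text equation.

The analogous main-text equation for wrong incorporations can
be derived following the same procedure, but considering the
contribution to the entropy production coming from the
incorporation of wrong matches,

$$\dot S^w_{\rm tot} = \sum_{\ldots,\langle ij\rangle}\mathcal{J}^w_{ij}(\ldots)\log\left[\frac{k^w_{ij}P(\ldots w_j)}{k^w_{ji}P(\ldots w_i)}\right],$$

from which we also define the corresponding per-step quantity
$\Delta S^w_{\rm tot}$. Note that $\dot S^w_{\rm tot} \ge 0$, since
all terms of the sum in its definition are non-negative.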
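The step from Eq. (15) to Eq. (16) is a purely algebraic identity, which can be checked numerically. The sketch below is illustrative only: it assumes the Boltzmann forms $\eta_{\rm eq} = e^{-\Delta E^w/T}/(e^{-\Delta E^r/T} + e^{-\Delta E^w/T})$ and $\Delta F_{\rm eq} = -T\log(e^{-\Delta E^r/T} + e^{-\Delta E^w/T})$ for the quantities defined in Results, which are not reproduced in this section, and all function names are hypothetical.

```python
import math


def kl_bernoulli(eta, eta_eq):
    """Kullback-Leibler divergence D(eta || eta_eq) between two Bernoulli distributions."""
    return (eta * math.log(eta / eta_eq)
            + (1.0 - eta) * math.log((1.0 - eta) / (1.0 - eta_eq)))


def equilibrium_quantities(dE_r, dE_w, T):
    """ASSUMED Boltzmann forms of the equilibrium error and free energy per step."""
    z = math.exp(-dE_r / T) + math.exp(-dE_w / T)
    eta_eq = math.exp(-dE_w / T) / z
    dF_eq = -T * math.log(z)
    return eta_eq, dF_eq


def eq15_terms(eta, dE_r, dE_w, T):
    """Error-dependent terms of Eq. (15), multiplied by T and divided by v."""
    return (-eta * (T * math.log(eta) + dE_w)
            - (1.0 - eta) * (T * math.log(1.0 - eta) + dE_r))


def eq16_terms(eta, dE_r, dE_w, T):
    """The same contribution as written in Eq. (16): -dF_eq - T * D(eta || eta_eq)."""
    eta_eq, dF_eq = equilibrium_quantities(dE_r, dE_w, T)
    return -dF_eq - T * kl_bernoulli(eta, eta_eq)


# Illustrative values only; the two expressions agree up to rounding.
lhs = eq15_terms(0.01, -5.0, -2.0, 1.0)
rhs = eq16_terms(0.01, -5.0, -2.0, 1.0)
```

The remaining term of Eq. (15), the sum of fluxes times chemical drivings, is what becomes $v\,\Delta W$ in Eq. (16) and is not part of this check.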


Thermodynamic bound for proofreading 

In copying schemes assisted by kinetic proofreading, the
proofreading reaction removes incorporated monomers
at an average speed $v_p = J^r_{n+1,0} + J^w_{n+1,0}$, where
the subindex "p" denotes quantities that correspond to
the proofreading reaction. The average proofreading
speed can be written as a sum of contributions coming
from right and wrong monomers, $v_p = v^r_p + v^w_p < 0$.
Proofreading is fueled by a chemical driving $\mu_{0,n+1}$,
which is the same for right and wrong matches (recall
that the proofreading reaction is driven backward).
By direct substitution, one can show that
the average work per proofread monomer is
$\Delta W_p = \Delta W^r_p = \Delta W^w_p = \mu_{0,n+1}$.
According to our convention, monomer removal corresponds to
$v_p < 0$. In an effective proofreading scheme, errors are
removed on average, $v^w_p = J^w_{n+1,0} < 0$. Consider now the
entropy production rate of proofreading wrong monomers,
$\dot S^w_{p,{\rm tot}} = J^w_{n+1,0}\log[(p_0\,k^w_{n+1,0})/(p^w_{n+1}\,k^w_{0,n+1})]$.
As every term of $\dot S_{\rm tot}$, this quantity satisfies a
second-law-like inequality, $\dot S^w_{p,{\rm tot}} \ge 0$. By means
of this inequality, and using $v^w_p < 0$, $p_0 = 1$ and
$p^w_{n+1} = \eta$, we obtain the general proofreading
bound of the main text.
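The intermediate inequality behind this bound can be written out explicitly. The sketch below uses only the expression for the entropy production rate of proofreading wrong monomers given above. Expressing the rate ratio through the driving $\mu_{0,n+1}$ additionally requires the rate parametrization of the model figure, which is not reproduced here.

```latex
\begin{align}
\dot S^w_{p,\mathrm{tot}}
  = J^w_{n+1,0}\,\log\frac{p_0\,k^w_{n+1,0}}{p^w_{n+1}\,k^w_{0,n+1}} \ge 0,
  \quad J^w_{n+1,0} < 0
  &\;\Longrightarrow\;
  \log\frac{k^w_{n+1,0}}{\eta\,k^w_{0,n+1}} \le 0 \\
  &\;\Longrightarrow\;
  \eta \ge \frac{k^w_{n+1,0}}{k^w_{0,n+1}}\,.
\end{align}
```

The error is thus bounded from below by the ratio between the rate at which wrong monomers enter through the proofreading pathway and the rate at which that pathway removes them.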


Solution of the proofreading model 


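To solve the proofreading protocol (see the corresponding
figure), we start from Eqs. (7), which at steady state imply
$J^r_{10} - J^r_{21} = 0$ and $J^w_{10} - J^w_{21} = 0$. Solving
for the occupancies of the intermediate states yields
$p^r_1 = (k^r_{10} + (1-\eta)\,k^r_{12})/(k^r_{01} + k^r_{21})$ and
$p^w_1 = (k^w_{10} + \eta\,k^w_{12})/(k^w_{01} + k^w_{21})$. The speeds
of incorporation of right and wrong monomers are
$v^r = \mathcal{N}[k^r_{20} + k^r_{21}\,p^r_1 - (1-\eta)(k^r_{12} + k^r_{02})]$ and
$v^w = \mathcal{N}[k^w_{20} + k^w_{21}\,p^w_1 - \eta\,(k^w_{12} + k^w_{02})]$,
where $\mathcal{N}$ is the previously defined normalization
constant. Substituting these expressions in the main-text
relation between the error and the incorporation speeds yields

$$\frac{\eta}{1-\eta} = \frac{k^w_{20} + k^w_{21}\,p^w_1 - \eta\,(k^w_{12} + k^w_{02})}{k^r_{20} + k^r_{21}\,p^r_1 - (1-\eta)(k^r_{12} + k^r_{02})}\,, \tag{17}$$

which can be easily solved for the error $\eta$.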
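As an illustration of how Eq. (17) can be solved in practice, the sketch below finds the error by bisection. It is a minimal example under stated assumptions, not the authors' implementation: the rate values are hypothetical, rates are stored in dictionaries keyed by "ij" (rate from state j to state i), and the common normalization constant is dropped because it cancels in Eq. (17).

```python
def intermediate_occupancy(k, x):
    """Steady-state occupancy p_1 of the single intermediate state.

    k: dict of rates keyed by "ij" (rate from state j to state i).
    x: occupancy of the final state (1 - eta for right, eta for wrong matches).
    """
    return (k["10"] + x * k["12"]) / (k["01"] + k["21"])


def speed(k, x):
    """Incorporation speed, up to the common normalization constant."""
    p1 = intermediate_occupancy(k, x)
    return k["20"] + k["21"] * p1 - x * (k["12"] + k["02"])


def residual(eta, k_r, k_w):
    """Vanishes when Eq. (17) holds: eta * v_r - (1 - eta) * v_w."""
    return eta * speed(k_r, 1.0 - eta) - (1.0 - eta) * speed(k_w, eta)


def solve_error(k_r, k_w, iterations=200):
    """Bisection on [0, 1].

    For positive rates the residual is negative at eta = 0 and
    positive at eta = 1, so a root is bracketed.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid, k_r, k_w) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical rates, chosen only for illustration.
rates_right = {"10": 1.0, "01": 1.0, "21": 1.0, "12": 0.1, "20": 0.01, "02": 0.1}
rates_wrong = {"10": 1.0, "01": 10.0, "21": 1.0, "12": 0.1, "20": 0.01, "02": 1.0}
eta = solve_error(rates_right, rates_wrong)
```

Bisection is used here only for robustness. Since the occupancies and the speeds are linear in $\eta$, the residual is quadratic in $\eta$ and a closed-form solution is also possible.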

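To scrutinize the effectiveness of proofreading, we
parametrize the rates as in the corresponding figure.
Considering the strongly-driven regime
$\mu_{21}, \mu_{02} \gg T$, Eq. (17) becomes

$$\frac{\eta}{1-\eta} = \frac{\omega_{21}\,p^w_1 - \eta\,\omega_{02}\,e^{(\mu_{02}-\mu_{21}+\Delta E^w)/T}}{\omega_{21}\,p^r_1 - (1-\eta)\,\omega_{02}\,e^{(\mu_{02}-\mu_{21}+\Delta E^r+\delta_{02})/T}}\,. \tag{18}$$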

From Eq. (18) one can deduce that the error $\eta$ is a
decreasing function of the combination of parameters
$K = (\omega_{02}/\omega_{21})\,e^{(\mu_{02}-\mu_{21})/T}$, which tunes
the intensity of proofreading. However, increasing $K$ also
increases the absolute value of the proofreading speed
$v_p = \mathcal{N}[k^r_{20} + k^w_{20} - k^r_{02}(1-\eta) - k^w_{02}\,\eta]$,
so that $K$ can be increased only up to the point where the net
elongation speed vanishes. Finding the maximum value of $K$ by
the condition $v = 0$ and substituting in Eq. (18) leads to the
corresponding main-text result. In this case,
$\eta_1$ is determined by the large kinetic barrier.